The Necessity of Rehashing
To guarantee the desirable average-case performance for search and insertion, the Load Factor () must be strictly capped, where is the number of elements and is the table capacity.
If is allowed to grow indefinitely, collisions increase exponentially, and the average time complexity degrades toward .
| Condition | Action Triggered | Impact |
|---|---|---|
| Standard insertion | Optimal efficiency maintained. | |
| Resizing (Rehashing) | Restores performance, but incurs temporary cost. |
Common Thresholds (): 0.70 to 0.75.
The Resizing Process
Resizing requires recalculating the hash index for every single item currently in the table, a process known as Rehashing.
- New Capacity Determination: Select a new capacity , usually double the current capacity (). This ensures the new is half of the critical threshold.
- Table Creation: Allocate a new hash table array of size .
- Item Iteration: Loop through all existing elements in the old table.
- Re-hashing: For each key , calculate the new index using the updated modulus:
- Insertion: Insert the item into the new table at .
Note: Since the modulus changes, simply copying the array is impossible; every item must be re-inserted.
Amortized Cost
Why Resizing is
Resizing requires processing all elements, meaning the operation itself takes time, which temporarily violates the goal of insertion.
Amortized Analysis
We use Amortized Analysis to justify this cost. If the table doubles its size every time it resizes (exponential growth), the expensive cost is spread out over a large number of intervening insertions.
The average cost of any single insertion, factoring in the periodic resize, remains .
1. A hash map has a capacity and a max load factor . At what number of elements () must a resize be triggered?
2. During a resize, why can't we simply copy elements from the old table to the new, larger table?
3. What is the amortized time complexity of an insertion in a hash table that doubles its size when resizing?
4. What is the primary consequence of not resizing a hash table when its load factor becomes too high?